Machine learning models using non-linear techniques improve the prediction of resting energy expenditure in individuals receiving hemodialysis

Abstract
Purpose: Approximately 700,000 people in the USA have chronic kidney disease requiring dialysis. Protein-energy wasting (PEW), a condition of advanced catabolism, contributes to three-year survival rates of 50%. PEW occurs at all levels of Body Mass Index (BMI) but is devastating for those people at the extremes. Treatment for PEW depends on an accurate understanding of energy expenditure. Previous research established that current methods of identifying PEW and assessing adequate treatments are imprecise. This includes disease-specific equations for estimated resting energy expenditure (eREE). In this study, we applied machine learning (ML) modelling techniques to a clinical database of dialysis patients. We assessed the precision of the ML algorithms relative to the best-performing traditional equation, the MHDE.
Methods: This was a secondary analysis of the Rutgers Nutrition and Kidney Database. To build the ML models we divided the population into training and validation sets. Eleven ML models were run and optimized, with the best three selected by the lowest root mean squared error (RMSE) from measured REE. Values for eREE were generated for each ML model and for the MHDE. We compared precision using Bland-Altman plots.
Results: Individuals were 41.4% female and 82.0% African American. The mean age was 56.4 ± 11.1 years, and the median BMI was 28.8 (IQR = 24.8 − 34.0) kg/m². The best ML models were SVR, Linear Regression and Elastic Net with RMSE of 103.6 kcal, 119.0 kcal and 121.1 kcal respectively. The SVR demonstrated the greatest precision, with 91.2% of values falling within acceptable limits. This compared to 47.1% for the MHDE. The models using non-linear techniques were precise across extremes of BMI.
Conclusion: ML improves precision in calculating eREE for dialysis patients, including those most vulnerable to PEW. Further development for clinical use is a priority.

Previous investigators have shown that disease-specific variables, such as inflammation, blood glucose levels and renal biomarkers, are predictive of REE in individuals receiving dialysis [21,23].
To date, dialysis-specific EEs have improved predictive accuracy by achieving around 60% precision (± 10% of zero difference from mREE) when tested in samples of people with similar characteristics to the development group [20][21][22][23][24][25]. However, our group has demonstrated that disease-specific EEs may not perform well when transferred to different geographical or demographic samples, or to those people with outlying characteristics, such as very low or high body mass index (BMI) [27]. Arguably, such individuals may be the most vulnerable.
In recent years, more sophisticated machine learning (ML) techniques have been applied to medical diagnostics for hemodialysis patients. These methodologies can improve the prediction of clinical outcomes (such as disease progression and mortality) using common clinical factors [28][29][30][31][32]. ML takes a very different approach to traditional linear regression [33,34]. It engages the variables without preconception to determine algorithmic rules from inherent patterns in the data [34]. This is achieved by randomizing large sets of variables into combinations of linear and non-linear models and can select features that yield complex and nuanced interactions [34,35]. In addition, increases in computational power have tremendously improved the researcher's ability to process large samples of data in vastly more complex and accurate ways [34,35].
It is our hypothesis that such techniques may also be applied to clinical nutrition. For example, using 114 individuals, Ponce et al. applied ML techniques to predict REE in patients with acute kidney injury (AKI) receiving dialysis [28]. This group used a combination of linear and non-linear regression models including Linear Regression with stepwise selection, Linear Regression with Regularization, RPART, Support Vector Machine with Radial Kernel, Generalized Boosting Machine, Extreme Gradient Boosting and Random Forest [28]. The best model (Random Forest) predicted REE with 69% accuracy compared to 24% for the Harris-Benedict equation [28].
The Rutgers Nutrition and Kidney Database (RNKD) is a large renal database in the USA, containing over 600 clinical and demographic variables gathered for 210 hemodialysis patients over 4 separate studies undertaken between 2012 and 2018 [27]. Using this database, the primary objective of this pilot study was to ascertain whether a machine learning approach alone can generate a more precise estimation of REE than previous statistical methods using linear and logistic regression only [20][21][22][23][24][25]. A secondary purpose was to clinically consider the features selected by the best-performing ML models, to guide the direction of future research.

Study design
This was a secondary analysis of an existing database, the RNKD. The RNKD is an amalgam of four existing studies conducted from 2012 to 2018 [36][37][38][39]. The studies were all undertaken in the Northeast and Midwest regions of the United States and only included people receiving maintenance hemodialysis (MHD), 3 times a week for at least 3 months. Sampling for enrollment was conducted by convenience, meaning that individuals in dialysis clinics were asked to volunteer.
Inclusion and exclusion criteria were similar in every study and are described in detail by Byham-Gray et al. [21]. Participants were women and men older than 18 years with stage 5 CKD receiving MHD (conventional method, in-center at a for-profit dialysis unit) 3 times per week for at least 3 months. Exclusion criteria included infective complications or poorly healing wounds, surgical procedures or cardiovascular events within 30 d of enrollment, recreational pharmaceutical usage, frequent ingestion of dietary supplements, and a previous diagnosis of heart failure, hepatic disease, or cancer.

Data collection
All the studies in the RNKD used similar data-gathering protocols. The individuals and their medical records provided demographic data. Anthropometric and clinical data were gathered on a day free of dialysis. REE was determined by indirect calorimetry (IC) using a metabolic cart (Cosmed Quark RMR®, Rome, Italy). Participants were requested not to exercise vigorously and to fast for 12 h before the assessment. If a 12 h fast was not possible then a 4 h fast was requested. The fast was introduced to reduce fluid accumulation and its impact on body weight and composition. IC took place before 12 pm. Participants lay still and awake for at least twenty minutes. The measurement protocol was adopted as previously defined by Olejnik et al. [36] and is fully described in previous studies published by this group [21,23,27].

Data mining
The RNKD is stored in a password-protected SPSS file on the Rutgers Box platform. After approval from the Rutgers University Institutional Review Board (Protocol number: Pro2020001656), the dataset was extracted and delivered as a separate SPSS file. All the data were de-identified before further analysis. They were stored on a password-protected laptop and shared only with co-investigators over encrypted email or cloud. As this was a secondary analysis, no further permissions were required.

Measurement of estimated resting energy expenditure via machine learning models
In this study, we provided the largest group of variables possible to the ML models and allowed the models to effectively choose the predictive features for themselves. The RNKD was screened and cleaned for inaccuracies and assessed by two renal dietitians. Variables were excluded if found to be clinically irrelevant to human metabolism, statistically insignificant, or where insufficient values were available to maintain the integrity of the dataset. For the construction of the models, individuals were split between a training set (80% of cases) and a validation set (20% of cases). Although the case selection was random, each set aimed to maintain the BMI distribution of the entire sample so long as the data permitted, as this was pertinent to our ultimate assessment of precision. Preprocessing procedures included assessment of multicollinearity, Box-Cox transformations, centering and scaling of predictors, and creation of dummy variables constructed from linear and non-linear combinations of existing variables to assess such combined effects. Where it was deemed appropriate to impute missing values, this was done using the mean of the variable's observed values.
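The BMI-stratified 80/20 split described above could be sketched as follows. This is a minimal illustration in plain Python, not the study's code: the cut-points in `bmi_category`, the record layout, the fixed seed and the toy cohort are all assumptions made for the example.

```python
import random
from collections import defaultdict


def bmi_category(bmi):
    """Map BMI (kg/m^2) to three groups; the exact cut-points are an assumption."""
    if bmi < 25.0:
        return "underweight/normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def stratified_split(records, validation_fraction=0.2, seed=0):
    """Randomly split records into training and validation sets within each BMI group."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_group = defaultdict(list)
    for rec in records:
        by_group[bmi_category(rec["bmi"])].append(rec)

    train, validation = [], []
    for group in by_group.values():
        rng.shuffle(group)  # random case selection inside the stratum
        n_val = round(len(group) * validation_fraction)
        validation.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, validation


# Hypothetical cohort: 30 people, 10 in each BMI group.
cohort = [{"id": i, "bmi": bmi}
          for i, bmi in enumerate([22.0] * 10 + [27.0] * 10 + [33.0] * 10)]
train_set, validation_set = stratified_split(cohort)
```

Splitting within each stratum keeps the proportion of each BMI group roughly equal in both sets, which matters here because precision is later reported per BMI subgroup.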
In total, eleven ML models were developed in the training set using linear and non-linear regression, both internally in the ML regression techniques and in combining variables in each model. The selection included Bayesian Ridge, Elastic Net, Gradient Boosting Regressor, Lasso, Linear Regression, Linear SVR (support vector regression), MLP (multi-layer perceptron) Regressor, Random Forest, Ridge Regression, SGD (stochastic gradient descent) Regressor and SVR. Given the large number of features present in the dataset (435) vs. the number of patients, we developed a feature selection technique that optimized the target root mean squared error (RMSE) and R squared (R²) by progressively eliminating the least impactful feature. Variables were excluded from each model one by one if this lowered the RMSE. The breakpoint was set when marginal exclusion increased RMSE, at which point the number of features was fixed. The performance of each model was assessed in the validation set using relative analysis of the highest R² and the lowest RMSE. We selected the best three models prioritizing RMSE as this reflects the lowest average divergence from mREE in kcal. We then used the best three models to generate estimates of REE in the validation set.
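One way to picture the elimination procedure is as a greedy backward search that stops once dropping any further feature no longer lowers RMSE. The sketch below is an illustration under stated assumptions, not the authors' implementation: `rmse_of` is a hypothetical hook that would fit a given model on a feature subset and return its RMSE (the text does not specify on which data this score was computed), and `toy_rmse` with its feature names is invented purely to make the example runnable.

```python
def backward_eliminate(features, rmse_of):
    """Greedy backward elimination driven by RMSE.

    features: list of candidate feature names.
    rmse_of:  callable taking a list of feature names and returning the RMSE
              of a model fitted with exactly those features (hypothetical hook).
    """
    selected = list(features)
    best_rmse = rmse_of(selected)
    while len(selected) > 1:
        # Score every subset obtained by dropping exactly one feature.
        trials = [(rmse_of([f for f in selected if f != drop]), drop)
                  for drop in selected]
        trial_rmse, drop = min(trials)
        if trial_rmse >= best_rmse:
            break  # breakpoint: no single removal lowers RMSE any further
        selected.remove(drop)
        best_rmse = trial_rmse
    return selected, best_rmse


# Toy scoring function (an assumption for illustration only): informative
# features reduce error when present, uninformative ones add a small penalty.
INFORMATIVE = {"lean_body_mass": 60.0, "weight_kg": 25.0}


def toy_rmse(subset):
    rmse = 100.0
    for name, gain in INFORMATIVE.items():
        if name not in subset:
            rmse += gain
    rmse += 5.0 * sum(1 for f in subset if f not in INFORMATIVE)
    return rmse


kept, final_rmse = backward_eliminate(
    ["lean_body_mass", "weight_kg", "noise_a", "noise_b"], toy_rmse
)
```

In practice the scoring hook would wrap one of the eleven regressors, so the search is repeated per model and can settle on a different feature count for each.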

Measuring estimated resting energy expenditure via predictive equation
For this study, the best model of the Maintenance Hemodialysis Equation (including C-reactive protein {CRP}) was used (MHDE-CRP) [27]. The variables used to create values for eREE were age, sex, weight, and CRP. Although not all individuals in the validation group had values for CRP, we assessed that sufficient real values existed, and that imputation of the rest would not substantially alter the explanation of variance. Missing values were imputed using the median CRP for the entire sample (training + validation sets), which reflected the distribution of values within our dialysis population at large.
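The median imputation step can be illustrated with a few lines of Python. This is a sketch only: the function name and the CRP values are hypothetical, and missing entries are assumed to be represented as `None`.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical CRP values for five individuals, two of them missing.
crp = [3.0, None, 12.0, 5.0, None]
crp_filled = impute_median(crp)
```

The median is less sensitive than the mean to the right-skewed values that inflammatory markers often show, which is one plausible reason to prefer it for CRP.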

Statistical and graphical analyses
No power analysis was undertaken in this study as it was previously included when the MHDE was constructed by this research group [21]. At that time, n = 60 was adequate for equation building and n = 95 was adequate for validation of the equation. Our latest study included 167 individuals from the same dataset in our ML analysis, hence further investigations on sample size were deemed unnecessary. Furthermore, statistical significance was demonstrated for the relevant findings, indicating that the study utilized a sufficient sample size.
ML models were developed, and analysis performed, using Python (version 8.4.0) and the scikit-learn package (version 1.1.1). Statistical analyses were performed using Statistical Package for Social Sciences (SPSS, IBM Corp., version 27, Armonk NY). If values were found to be normal via visual inspection, they were expressed as mean and standard deviation (SD). If not normal, values were stated as median, 25th and 75th percentiles. An intraclass correlation coefficient (ICC) was calculated to analyze the reliability of each equation using a model with a single rater, 2-way mixed-effects and absolute agreement [37]. The a priori alpha level was set at 0.05.
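For readers unfamiliar with this form of the ICC, the sketch below computes a single-measurement, absolute-agreement coefficient from two-way ANOVA mean squares for two methods (measured and estimated REE). It is an illustration, not the SPSS procedure used in the study; the function name and the example values are assumptions.

```python
def icc_absolute_single(measured, estimated):
    """ICC for absolute agreement, single measurement, two methods (k = 2).

    Uses the two-way ANOVA mean squares:
    (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    """
    rows = list(zip(measured, estimated))
    n, k = len(rows), 2
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between methods
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical REE values in kcal/day.
measured = [1500.0, 1700.0, 1900.0, 2100.0]
biased = [m + 100.0 for m in measured]  # estimate with a constant 100 kcal offset

icc_identical = icc_absolute_single(measured, measured)
icc_biased = icc_absolute_single(measured, biased)
```

A constant offset lowers the coefficient below 1 even though the two series are perfectly correlated, which is why absolute agreement, rather than consistency, suits a comparison against a criterion measure.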
We used a modified Bland-Altman plot to measure the levels of agreement between mREE and eREE from each model [38]. The original Bland-Altman plot graphically assesses agreement between two methods of measurement by examining one method on the Y-axis by comparison with either the true measure on the X-axis or the mean of both measures if the criterion is not known [38]. In this case, we used residual values calculated as a percentage on the Y-axis and mREE (the criterion measure) on the X-axis. A full description of the method was previously published by this group [27]. Limits of agreement for predictive equations have been established at ± 10% from zero difference from mREE in the nutrition literature [38]. Those limits have been used for validation by Byham-Gray et al. [21,23], Morrow et al. [25] and Bailey et al. [27] when assessing equations for people receiving dialysis [21,23,25,27]. This graphical analysis was applied to each of the best models (and the MHDE) across the complete validation sample for which REE was generated. The analysis was subsequently repeated with the validation set divided into subgroups of BMI. Individuals with a BMI less than 24.9 kg/m², 25-29.9 kg/m², or ≥ 30 kg/m² were categorized as underweight/normal weight, overweight, or obese, respectively.
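The quantity plotted on the Y-axis, and the share of estimates falling within the ± 10% limits, can be computed as in the sketch below. The sign convention (eREE minus mREE, divided by mREE) is an assumption, as are the function names and the example values.

```python
def percent_residual(measured, estimated):
    """Residual as a percentage of measured REE (the criterion measure)."""
    return 100.0 * (estimated - measured) / measured


def agreement_summary(measured, estimated, limit=10.0):
    """Share of estimates within, below and above the +/- limit (in percent)."""
    residuals = [percent_residual(m, e) for m, e in zip(measured, estimated)]
    n = len(residuals)
    within = sum(1 for r in residuals if abs(r) <= limit)
    under = sum(1 for r in residuals if r < -limit)   # underestimates
    over = sum(1 for r in residuals if r > limit)     # overestimates
    return {
        "within_pct": 100.0 * within / n,
        "under_pct": 100.0 * under / n,
        "over_pct": 100.0 * over / n,
    }


# Hypothetical measured and estimated REE values in kcal/day.
m_ree = [1500.0, 1600.0, 1800.0, 2000.0]
e_ree = [1550.0, 1400.0, 1850.0, 2300.0]
summary = agreement_summary(m_ree, e_ree)
```

Running the same summary on each BMI subgroup of the validation set gives the per-category percentages that the subgroup analysis reports.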

Clinical and narrative analysis
We categorized the features selected by the best models into groupings to consider their clinical significance. These groupings were demographic, anthropometric, disease-related, dynamic/clinical, patient-reported and provider-assessed. We then assessed the distribution of features amongst the groups to narratively identify trends that may assist future researchers.

Results
In total, 167 of the individuals retained sufficient variables for this study. The population was 58.7% male, 82% African American, and 80.2% Non-Hispanic (Table 1). Ages ranged between 21.5 and 80.7 years. The mean age was 56.4 ± 11.4 years (Table 2). The median BMI for the group was 28.8 (IQR = 25.8-34.0) kg/m². 25.7% of individuals were categorized as underweight or normal weight, 35.3% as overweight, and 39.0% as obese. The sample was randomly split into an 80% training sample and a 20% validation sample while maintaining the BMI stratification constraint. There was no statistical difference in the frequencies of sex, race, ethnicity and BMI between the total and the validation samples.

Selecting the most accurate machine learning models to predict energy requirements
Eleven ML models were run and optimized within the training set (N = 133) to predict REE. Of the full dataset, 43 subjects and 171 variables were omitted due to significant missing data. 188 variables were excluded as they were not deemed clinically relevant, and 11 variables were omitted from modelling as they were not statistically relevant. In total, the optimized models selected 55 features with an individual model range between 8 and 41 features (Table 3).
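The three best models were selected because they exhibited the lowest RMSE (kcal) from mREE. These were the SVR, Linear Regression and Elastic Net, with RMSE of 103.6 kcal, 119.0 kcal and 121.1 kcal respectively. The SVR demonstrated the greatest precision, with 91.2% of values falling within acceptable limits, compared to 47.1% for the MHDE (Figures 1-4).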

Variability of agreement in different categories of BMI
For participants with obesity, the SVR REE and Linear REE showed the same levels of accuracy (84.6% within limits) (Table 6) and Elastic Net REE predicted 76.9% of estimates within acceptable limits. For all the ML models, the values outside of limits were underestimated (Figure 5(b-d)). The MHDE REE demonstrated only 46.2% accuracy for obese persons, with inaccurate estimates split evenly between over- and underestimation (Figure 5(a)). For participants who were overweight, accuracy was highest for the Linear Regression REE (100.0% within limits), closely followed by the SVR REE and Elastic Net REE, both with 92.3% within limits (Table 6 and Figure 6(b,c)). In this subgroup, the MHDE REE performed with greater accuracy than for the total group (Figure 6(a)). Again, the ML models tended to underestimate when inaccurate and the MHDE REE tended to overestimate eREE where values were outside the limits of agreement (Figure 6(a-d)). For individuals who were of normal weight or underweight, none of the values for the MHDE REE reached the threshold for agreement and 87% of the values underestimated energy expenditure (Table 7 and Figure 7(a)). The Linear Regression REE achieved 50% of values and the Elastic Net REE achieved 62.5% of values within acceptable limits (Figure 7(c,d)). The SVR REE performed best with 100% of estimates within acceptable limits (Figure 7(b)).

Feature selection
The ML models selected 55 specific features with a very wide range of correlation to mREE (r = 0.77 to 0.005). In general, anthropometric features tended to show the highest association with mREE. The top 5 correlated features were lean body mass, weight in kg, dry weight 6 months previously, intradialytic weight gain and height in cm. As the modeling technique eliminated features deemed to be collinear, only two of the top features were employed by all three of the best models (lean body mass and intradialytic weight gain). The best models selected 48 features in total, of which 16 were common to all, 14 were common to two models and 18 were selected by only 1 model (Table 7). The best model (SVR) had the most individually selected features. The SVR also demonstrated the most even split of features among the defined demographic, anthropometric and clinical categories (Figure 8).

Discussion
Our previous research demonstrated that traditional linear regression methods for predicting REE in patients with CKD are insufficiently accurate [27]. This is especially true for more vulnerable individuals as demonstrated by extremes of BMI [27]. In this pilot study, we evaluated the efficacy of ML models to better predict REE using sophisticated techniques. Although the validation sample was small, all three of the best models achieved improved precision over the best predictive equation in this population, the MHDE-CRP. Moreover, two of the ML models, SVR and Elastic Net, demonstrated markedly better predictive ability across the three subgroups of BMI (underweight/normal weight, overweight, obese).

Protein energy wasting, a clinical challenge
It is estimated that up to 75% of patients treated with dialysis suffer from PEW, a unique nutritional condition that is a separate and strong risk factor for poor health sequelae and mortality [10,40]. However, identifying people with PEW can be difficult. For example, PEW can commonly occur at all levels of BMI, including in obesity, where it can be difficult for providers to pinpoint symptoms of muscle catabolism [41]. Furthermore, traditional methods for assessing malnutrition, such as the Subjective Global Assessment (SGA), perform sub-optimally in specifically highlighting PEW in this population [40]. The ability to identify accurate energy expenditure in all individuals treated with dialysis may address diagnostic failings and provide a critical first step in selecting an appropriate care plan. In addition to a restrictive diet, often disrupted by dialysis treatment, several factors contribute to PEW [10]. Metabolic derangements such as acidosis subdue the anabolic action of insulin and can promote the oxidation of amino acids [42,43]. Pro-inflammatory drivers from disease, dialysis treatment, access and the poor biocompatibility of dialysis methods can have a deleterious impact on appetite and directly exacerbate muscle catabolism [10][44][45][46][47]. Furthermore, hormonal changes resulting from comorbidities such as diabetes contribute to the loss of lean body mass [48]. The process of PEW is both complex and dynamic, requiring multiple strategies of nutritional, medical and lifestyle intervention [10][45][46][47]. As improvements in dialysis treatment (such as improved filtration and biocompatibility of medical materials) continue to evolve, dynamic methods to assess their application are increasingly appropriate [44][45][46][47]. The ability of providers to accurately assess REE and its changes over time would provide a powerful tool for monitoring the ongoing need for, and effectiveness of, such interventional strategies.

Demographic features
All existing equations predicting REE incorporate age and sex, as these are highly correlated with metabolic output [15][16][20][21][22][23]. Interestingly, the best three models selected these features stochastically and neither was included by the SVR. This can be partially explained by the inclusion of LBM and FFM as variables in the dataset, as the Hume and Deurenberg equations used to derive them build in sex [49,50]. The omission of age by all but one of the best models is, however, a finding that requires further explanation. This may be consistent with our hypothesis that interactions of features are as important as strong correlations with mREE. For example, it may be that the SVR was able to establish the impact of age from its effects on other features. In our previous analysis of predictive equations from different geographical samples, we hypothesized that racial differences in model building have an impact on estimated REE [27]. In this study, both non-linear models selected race as a feature, and one also selected ethnicity.

Disease-related features
Previous authors have found conflicting evidence regarding the impacts of clinical and disease factors on REE for dialysis patients [20][21][22][23]. Byham-Gray et al. established the importance of clinical biomarkers of inflammation (CRP), diabetes (hemoglobin A1c) and muscle catabolism (serum creatinine) in equation building [21]. Fernandes et al. also found a correlation between inflammation and mREE but did not determine that CRP explained REE variance in their sample [20]. In the best three ML models, disease-related features, clinical biomarkers and vital signs constituted the largest grouping of individual features. All three of the best models selected length of dialysis treatment, type of dialysis access, diabetes medication (insulin or oral), anti-inflammatory medication, etiology of hypertension, intradialytic weight gain and heart rate. These factors include a wide range of variables across the spectrum of CKD complications. Of note, many of the compound issues discussed in the PEW literature are represented in the list, including markers of diabetes, inflammation and type of dialysis access [10]. Indeed, it intuitively makes sense that if up to 75% of the dialysis population may be suffering from some degree of PEW, then the ML models will select PEW-relevant features in determining the metabolic drivers of REE. Our findings also agree with Ponce et al., who discovered that several disease-related, medical and dynamic factors (airway pressure, minute volume) were strong predictors in the best model for critically ill patients with AKI [28]. Although CKD and critical AKI seem at opposite ends of the kidney-disease spectrum, they share the symptoms of profound metabolic alteration due to extensive medical complications.

Patient-reported features
Another finding was the abundance of patient-reported variables related to appetite. The best models all selected 'enjoy mealtimes', 'weekly appetite rating', 'appetite rating' and 'daily appetite'. Additionally, they variably selected another 8 items related to appetite, intake and mealtime enjoyment, on or after dialysis treatment. From a clinical perspective, appetite loss has long been established as a common occurrence in patients receiving dialysis, where the treatment burden leads to substantial fatigue [51]. Moreover, poor appetite is associated with elevated levels of inflammatory cytokines and is a reliable marker of the proinflammatory state [52].

Provider-assessed features
We also observed that elements of malnutrition screening were commonly selected by the best three models, including three sections of the Subjective Global Assessment [53]. These included the physical examination, functional abilities and gastrointestinal symptoms. All these variables give information about an individual's functional outputs in terms of symptoms of frailty, energy utilization and the altered faculty to eat [53]. Although the SGA has been demonstrated to be a poor diagnostic for PEW directly, it may provide valuable information as part of a more comprehensive investigation [39].

Assessing the impact of non-linear modelling techniques
We hypothesized that by using non-linear modelling techniques, greater precision would be achieved in predicting REE across categories of BMI. The relationships of several factors influencing REE are known to be non-linear, the most fundamental being the interaction between height and weight [20][21][22][23][27][28]. In our previous research, we demonstrated that equations using linear techniques will perform with less precision at extremes of BMI as they fail to account for the changing relationship of height and weight along the correlation curve [27]. Ponce et al. observed that when ranking several ML models for the prediction of REE in patients with AKI, the non-linear models performed with greater accuracy than the linear models [28]. This study confirms these findings. Of our best three models, two were non-linear. Additionally, when the best models were assessed in subgroups of BMI, the non-linear models performed equally well for those individuals with the highest and lowest BMI. For example, the SVR REE predicted 100% of values within acceptable limits for the lowest subgroup of BMI and 85% of values within limits at the highest levels of BMI. In the past, where regression analysis has been used to create predictive equations, it has been argued that simplicity in the algorithm is a major consideration for use in clinical practice [20][21][22][23]. This has led authors to reject non-linear methods due to the difficulty of manual calculation. As almost all medical calculations are available online, or as apps on mobile devices, we argue that 'use with a pocket calculator' is an issue now largely redundant. Furthermore, the ability to obtain a large number of biomarkers from patients' electronic health records paves the way for easily embedding ML-based techniques for the prediction of REE.

Limitations of the study
The original studies in the RNKD were convenience-sampled in the Northeast and Midwest regions of the USA and hence the population was not as diverse as the national average. Additionally, those studies imposed strict medical criteria which resulted in the omission of sicker individuals. Many key variables (anthropometric and IC) were gathered on a non-dialysis day. This could affect post-dialysis weight and BMI, dependent on an individual's fluid intake and residual renal excretion. Only conventional hemodialysis was undertaken in the original studies. This gives limited insight into the clinical feature differences that may be attributable to peritoneal dialysis or more advanced techniques (such as hemodiafiltration or expanded hemodialysis). Future research should undertake a more comprehensive review of dialysis procedures. For the purpose of this study, certain variables were omitted from the ML dataset to preserve the number of subjects available for training and validation. These include key clinical markers such as CRP, hemoglobin A1c and serum creatinine, which have been previously shown to correlate with mREE.
Notwithstanding the omissions of variables, the validation set comprised only 34 individuals, which represents a small sample size. Finally, the best model (SVR) gave substantially improved precision and a glimpse into the features that may contribute. However, the model does not generate an equation and is, therefore, less interpretable as to the direction of effect.

Implications for practice and research
The application of a precise algorithm for predicting REE in patients receiving dialysis could present a powerful tool for providers to implement and monitor nutritional, medical and physical interventions to mitigate PEW. Current predictive equations provide inadequate precision and lack the scope to model clinical changes in metabolic needs. This is the first study to use ML to predict REE in this patient population with results that suggest a potential step change. However, this was a pilot study utilizing an existing database. To preserve the maximum sample size, some key clinical and functional biomarkers were not presented to the ML models. A research priority would be to expand the dataset to include the missing data and further explore interactions. Although our best model demonstrated high precision, it used many esoteric variables to generate accuracy. This necessarily limits the direct applicability to the clinical setting. A necessary next step is to identify the best dialysis biomarkers that could approximate the precision, and which patient-focused questions may help fill in the gaps. Thereafter the analysis should include people from different geographical locations.
ML is a process best applied to clinical spaces rich in data. A further application in the clinical nutrition field is in critical care, where data points are gathered throughout the day, precise calculation of REE is vital and REE is labile depending on the patient's medical progress. Another related field is exercise physiology where the measured inputs of athlete nutrition, lifestyle and training schedules may shed light on the metabolic outputs of lean body mass, REE and performance.

Conclusion
Machine learning models using non-linear techniques potentially provide a step-change in predicting REE in individuals with CKD. Feature selection by the ML models suggests that many contributing medical factors of PEW explain the variability of REE. Such information could reveal cost-effective strategies to benefit millions of people worldwide. Further research in this area is a priority.

Author contribution
Alainn Bailey and Laura Byham-Gray initially conceived and designed this study. Suril Gohel and Mohamed Eltawil selected and built the machine-learning models. Alainn Bailey, Laura Byham-Gray, Mohamed Eltawil and Suril Gohel contributed to the appropriate selection of data, statistical analysis and interpretation of the results. Alainn Bailey produced the draft paper, and it was revised for intellectual content by Alainn Bailey, Laura Byham-Gray, Suril Gohel and Mohamed Eltawil. All the authors agree to be accountable for all aspects of the work.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This study was supported by funding from the National Institutes of Health, mechanisms [1R15DK090593 01A1; 3R15DK090593 02; 6R15DK09059302], AHRQ mechanism [1K8HS023434 01A1], and by funding from the Academy of Nutrition and Dietetics and the Rutgers Intramural School of Health Professions Grant Program.

Data availability statement
The data that support the findings of this study are available on request from the corresponding author, Laura Byham-Gray, and require a data-sharing agreement. The data are not publicly available due to restrictions, i.e. the data may contain information that could compromise the privacy of research participants.